Understanding the Propagation of Hard Errors to Software and its Implications for Resilient System Design
Authors
Abstract
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field faults. To be broadly deployable, the hardware reliability solution must incur low overheads, precluding use of expensive redundancy. We explore a co-designed hardware-software solution that treats most hardware faults as software bugs and leverages common mechanisms for hardware and software reliability, thereby amortizing some of the overhead. Fundamental to such a solution is a characterization of how hardware faults in different microarchitectural structures of a modern processor propagate through the application and OS. This paper aims to provide such a characterization, identify low-cost detection methods to intercept fault propagation, and provide guidelines for a complete co-designed reliability solution. We focus on hard faults because they are increasingly important and have different system implications than the much-studied transients. We achieve our goals through fault injection experiments with a microarchitecture-level full-system timing simulator.
Similar resources
Design and Implementation of a Software System for Detecting Orthographical or Morphological Errors in Persian Words
This paper presents a new method for analyzing words in the Persian language context to find orthographical and structural errors regardless of the meaning. This technique tokenizes each word in a statement then tries to detect the kind of word, and analyses its correctness in terms of orthography and morphology by means of a lexicon. It should be noted that some words in the Persian language h...
An Effective Attack-Resilient Kalman Filter-Based Approach for Dynamic State Estimation of Synchronous Machine
Kalman filtering has been widely considered for dynamic state estimation in smart grids. Despite its unique merits, the Kalman Filter (KF)-based dynamic state estimation can be undesirably influenced by cyber adversarial attacks that can potentially be launched against the communication links in the Cyber-Physical System (CPS). To enhance the security of KF-based state estimation, in this paper...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
Implications of the Imperfect Deposit Market Structure for Micro and Macro Discretionary Prudential Policies
The aim of this study is to theoretically investigate the role of the bank deposit market structure in how effective micro and macro prudential policies are in determining the regulatory capital of banks in combination with monetary policy. To achieve this, a partial equilibrium analytical framework has been developed that includes rational economic entities and the possibility of contagion risk in...
Typical Ka band Satellite Beacon Receiver Design for Propagation Experimentation
This paper presents the design and simulation of a typical Ka band satellite beacon receiver for propagation experimentation. Using a satellite beacon signal as a reference signal is one of the most important methods in satellite wave propagation studies. Satellite beacons are frequently available for pointing large antennas, but such signals can be used for measuring the effect of natural phenome...
Publication date: 2007